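Relying on the premise that the performance of a binary neural network (BNN) can be largely restored by eliminating the quantization error between full-precision weight vectors and their corresponding binary vectors, existing work on network binarization frequently adopts the idea of model robustness to reach this objective. However, robustness remains an ill-defined concept without solid theoretical support. In this work, we introduce Lipschitz continuity, a well-defined functional property, as the rigorous criterion for defining the model robustness of BNNs. We then propose to retain Lipschitz continuity as a regularization term to improve model robustness. In particular, while popular Lipschitz-involved regularization methods often collapse in BNNs due to their extreme sparsity, we design Retention Matrices to approximate the spectral norms of the targeted weight matrices, which can be deployed as an approximation of the Lipschitz constant of BNNs without the exact Lipschitz constant computation (NP-hard). Our experiments show that our BNN-specific regularization method can effectively strengthen the robustness of BNNs (as verified on ImageNet-C), achieving state-of-the-art performance on CIFAR and ImageNet.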
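As a rough illustration of why spectral norms are a convenient stand-in for the Lipschitz constant, the sketch below estimates the largest singular value of a sign-binarized weight matrix by plain power iteration. For a feed-forward network with 1-Lipschitz activations, the product of these per-layer norms is an upper bound on the network's Lipschitz constant. This is not the Retention Matrix construction from the abstract; the function names (`binarize`, `spectral_norm`, `lipschitz_upper_bound`) and the toy matrices are assumptions made for illustration only.

```python
import math
import random


def binarize(weights):
    """Sign-binarize a matrix given as a list of rows (zero maps to +1)."""
    return [[1.0 if w >= 0 else -1.0 for w in row] for row in weights]


def matvec(matrix, vector):
    """Compute the product matrix @ vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def spectral_norm(matrix, iters=100, seed=0):
    """Estimate the largest singular value by power iteration on M^T M."""
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    for _ in range(iters):
        u = matvec(matrix, v)  # u = M v
        # v = M^T u, then renormalize to unit length
        v = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_v == 0.0:
            return 0.0
        v = [x / norm_v for x in v]
    # With v of unit length, ||M v|| approximates the top singular value.
    return math.sqrt(sum(x * x for x in matvec(matrix, v)))


def lipschitz_upper_bound(layers):
    """Product of per-layer spectral norms (assumes 1-Lipschitz activations)."""
    return math.prod(spectral_norm(layer) for layer in layers)


if __name__ == "__main__":
    latent = [[0.5, -0.2], [0.0, -3.0]]  # hypothetical latent FP weights
    print(spectral_norm(binarize(latent)))
```

Power iteration needs only matrix-vector products, which is why it is a common choice when an exact Lipschitz constant is too expensive to compute.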
translated by 谷歌翻译
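Neural network binarization accelerates deep models by quantizing their weights and activations to 1 bit. However, there is still a large performance gap between binary neural networks (BNNs) and their full-precision (FP) counterparts. Since the quantization error caused by weight binarization has been reduced in earlier work, activation binarization has become the main obstacle to further accuracy improvements. A BNN has a unique and interesting structure in which the binary and latent FP activations exist in the same forward pass (i.e., $\text{Binarize}(\mathbf{a}_F) = \mathbf{a}_B$). To mitigate the information degradation caused by the binarization operation from FP to binary activations, we establish a novel contrastive learning framework for training BNNs through the lens of mutual information (MI) maximization. MI is introduced as the metric for measuring the information shared between binary and FP activations, which assists binarization with contrastive learning. Specifically, the representation ability of BNNs is greatly strengthened by pulling together positive pairs of binary and FP activations from the same input sample and pushing apart negative pairs from different samples (the number of negative pairs can be very large). This benefits downstream tasks: not only classification, but also segmentation, depth estimation, etc. Experimental results show that our method can be implemented as a pile-up module on top of existing state-of-the-art binarization methods, and that it also shows its capability on NYUD-v2.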
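The positive/negative pairing described above can be sketched with a standard InfoNCE-style loss, a commonly used contrastive objective whose minimization maximizes a lower bound on mutual information. The sketch below is only an assumption about how such a loss could look: the cosine similarity, the `temperature` value, and the function names are illustrative choices, not details taken from the abstract.

```python
import math


def sign(x):
    """Binarize a scalar activation to +1 or -1 (zero maps to +1)."""
    return 1.0 if x >= 0 else -1.0


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(fp_acts, temperature=0.5):
    """Mean InfoNCE loss over a batch of full-precision activations.

    Each binarized activation (anchor) is pulled toward the FP activation
    of the same sample (positive) and pushed away from the FP activations
    of all other samples in the batch (negatives).
    """
    bin_acts = [[sign(x) for x in act] for act in fp_acts]
    losses = []
    for i, anchor in enumerate(bin_acts):
        logits = [cosine(anchor, fp) / temperature for fp in fp_acts]
        log_denominator = math.log(sum(math.exp(l) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


if __name__ == "__main__":
    batch = [[1.0, 2.0], [-1.5, -0.5]]  # hypothetical FP activations
    print(info_nce(batch))
```

When every sample in the batch is indistinguishable, the loss equals the logarithm of the batch size; well-separated samples push it toward zero.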
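Social media has grown rapidly in the public sphere because it makes new information easy to spread, which has led to the circulation of rumors. However, detecting rumors in such a massive volume of information is becoming an increasingly arduous challenge. Previous work has generally obtained valuable features from propagation information. It should be noted that most methods target only the propagation structure while ignoring the rumor transmission pattern. This limited focus severely restricts the collection of propagation data. To address this problem, the authors of this study are motivated to explore the regionalized propagation patterns of rumors. Specifically, a novel region-enhanced deep graph convolutional network (RDGCN) is proposed, which enhances the propagation features of rumors by learning regionalized propagation patterns and is trained to learn these patterns through unsupervised learning. In addition, a source-enhanced residual graph convolution layer (SRGCL) is designed to mitigate over-smoothing in graph neural networks (GNNs) and to raise the depth limit of GNN-based rumor detection methods. Experiments on Twitter15 and Twitter16 show that the proposed model outperforms the baseline methods in both rumor detection and early rumor detection.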
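In recent years, rumors have had a devastating impact on society, making rumor detection a significant challenge. However, studies on rumor detection have ignored the intense emotions conveyed by images in rumor content. This paper verifies whether image emotion improves rumor detection efficiency. A multimodal dual emotion feature for rumor detection, consisting of visual and textual emotions, is proposed. To the best of our knowledge, this is the first study to use visual emotion in rumor detection. Experiments on real-world datasets verify that the proposed features outperform state-of-the-art emotion features and can be plugged into existing rumor detectors while improving their performance.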
Existing federated classification algorithms typically assume the local annotations at every client cover the same set of classes. In this paper, we aim to lift such an assumption and focus on a more general yet practical non-IID setting where every client can work on non-identical and even disjoint sets of classes (i.e., client-exclusive classes), and the clients have a common goal which is to build a global classification model to identify the union of these classes. Such heterogeneity in client class sets poses a new challenge: how to ensure different clients are operating in the same latent space so as to avoid the drift after aggregation? We observe that the classes can be described in natural languages (i.e., class names) and these names are typically safe to share with all parties. Thus, we formulate the classification problem as a matching process between data representations and class representations and break the classification model into a data encoder and a label encoder. We leverage the natural-language class names as the common ground to anchor the class representations in the label encoder. In each iteration, the label encoder updates the class representations and regulates the data representations through matching. We further use the updated class representations at each round to annotate data samples for locally-unaware classes according to similarity and distill knowledge to local models. Extensive experiments on four real-world datasets show that the proposed method can outperform various classical and state-of-the-art federated learning methods designed for learning with non-IID data.
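The matching formulation above can be sketched as a nearest-class lookup: a sample is assigned to the class whose representation is most similar to the sample's representation, and the same similarity can be thresholded to pseudo-annotate samples for classes a client has never labeled. The toy vectors, the cosine similarity, the `threshold` parameter, and the function names below are assumptions for illustration; in the described method both representations would come from learned encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(data_repr, class_reprs):
    """Return the class name whose representation best matches data_repr."""
    return max(class_reprs, key=lambda name: cosine(data_repr, class_reprs[name]))


def pseudo_label(data_repr, class_reprs, threshold=0.9):
    """Annotate a sample only if its best match is confident enough.

    Returns the matched class name, or None when the highest similarity
    falls below the (hypothetical) threshold.
    """
    best = classify(data_repr, class_reprs)
    if cosine(data_repr, class_reprs[best]) >= threshold:
        return best
    return None


if __name__ == "__main__":
    # Hypothetical class representations produced by a label encoder.
    class_reprs = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
    print(classify([0.9, 0.1], class_reprs))
    print(pseudo_label([0.5, 0.5], class_reprs))
```

Because every client shares the same class names, anchoring class representations on them gives all clients a common target space to match against.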
Existing measures and representations for trajectories have two longstanding fundamental shortcomings: they are computationally expensive, and they cannot guarantee the 'uniqueness' property of a distance function, i.e., $\mathrm{dist}(X, Y) = 0$ if and only if $X = Y$, where $X$ and $Y$ are two trajectories. This paper proposes a simple yet powerful way to represent trajectories and measure the similarity between two trajectories using a distributional kernel to address these shortcomings. It is a principled approach based on kernel mean embedding, which has a strong theoretical underpinning. It has three distinctive features in comparison with existing approaches. (1) A distributional kernel is used for the very first time for trajectory representation and similarity measurement. (2) It does not rely on point-to-point distances, which are used in most existing distances for trajectories. (3) It requires no learning, unlike existing learning and deep learning approaches. We show the generality of this new approach in three applications: (a) trajectory anomaly detection, (b) anomalous sub-trajectory detection, and (c) trajectory pattern mining. We identify that the distributional kernel has (i) a unique data-dependent property and the above uniqueness property, which are the key factors that lead to its superior task-specific performance; and (ii) a runtime that is orders of magnitude faster than existing distance measures.
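To make the kernel mean embedding idea concrete, the sketch below treats each trajectory as an unordered set of points, averages a point kernel over all cross pairs to get a similarity between two trajectories, and derives a distance from it. A Gaussian kernel is swapped in purely for illustration: it is not data-dependent and is not necessarily the kernel the authors use. The `bandwidth` parameter and function names are likewise assumptions.

```python
import math


def gaussian_kernel(p, q, bandwidth=1.0):
    """Gaussian point kernel between two points of equal dimension."""
    squared_dist = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-squared_dist / (2.0 * bandwidth ** 2))


def distributional_kernel(traj_x, traj_y, bandwidth=1.0):
    """Inner product of the kernel mean embeddings of two point sets.

    Equals the average point-kernel value over all cross pairs (x, y).
    """
    total = sum(
        gaussian_kernel(p, q, bandwidth) for p in traj_x for q in traj_y
    )
    return total / (len(traj_x) * len(traj_y))


def kernel_distance(traj_x, traj_y, bandwidth=1.0):
    """Distance between mean embeddings: sqrt(K(X,X) + K(Y,Y) - 2 K(X,Y))."""
    squared = (
        distributional_kernel(traj_x, traj_x, bandwidth)
        + distributional_kernel(traj_y, traj_y, bandwidth)
        - 2.0 * distributional_kernel(traj_x, traj_y, bandwidth)
    )
    return math.sqrt(max(squared, 0.0))  # clamp tiny negative rounding errors


if __name__ == "__main__":
    straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # hypothetical trajectories
    detour = [(0.0, 0.0), (1.0, 1.5), (2.0, 0.0)]
    print(kernel_distance(straight, straight))
    print(kernel_distance(straight, detour))
```

No point-to-point alignment between the two trajectories is needed, and nothing is learned: the similarity is a plain average of kernel evaluations.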
Natural Language Processing (NLP) has been revolutionized by the use of Pre-trained Language Models (PLMs) such as BERT. Despite setting new records in nearly every NLP task, PLMs still face a number of challenges, including poor interpretability, weak reasoning capability, and the need for large amounts of expensive annotated data when applied to downstream tasks. By integrating external knowledge into PLMs, Knowledge-Enhanced Pre-trained Language Models (KEPLMs) have the potential to overcome these limitations. In this paper, we examine KEPLMs systematically through a series of studies. Specifically, we outline the common types and different formats of knowledge to be integrated into KEPLMs, detail the existing methods for building and evaluating KEPLMs, present the applications of KEPLMs in downstream tasks, and discuss future research directions. Researchers will benefit from this survey by gaining a quick and comprehensive overview of the latest developments in this field.
Autonomous robotic surgery has advanced significantly based on analysis of visual and temporal cues in surgical workflow, but relational cues from domain knowledge remain under investigation. Complex relations in surgical annotations can be divided into intra- and inter-relations, both valuable to autonomous systems for comprehending surgical workflows. Intra- and inter-relations describe the relevance of various categories within a particular annotation type and the relevance of different annotation types, respectively. This paper aims to systematically investigate the importance of relational cues in surgery. First, we contribute the RLLS12M dataset, a large-scale collection of robotic left lateral sectionectomy (RLLS), by curating 50 videos of 50 patients operated on by 5 surgeons and annotating a hierarchical workflow, which consists of 3 inter- and 6 intra-relations, 6 steps, 15 tasks, and 38 activities represented as the triplet of 11 instruments, 8 actions, and 16 objects, totaling 2,113,510 video frames and 12,681,060 annotation entities. Correspondingly, we propose a multi-relation purification hybrid network (MURPHY), which aptly incorporates novel relation modules to augment the feature representation by purifying relational features using the intra- and inter-relations embodied in annotations. The intra-relation module leverages an R-GCN to implant visual features in different graph relations, which are aggregated using a targeted relation purification with affinity information measuring label consistency and feature similarity. The inter-relation module is motivated by attention mechanisms to regularize the influence of relational features based on the hierarchy of annotation types from the domain knowledge. Extensive experimental results on the curated RLLS dataset confirm the effectiveness of our approach, demonstrating that relations matter in surgical workflow analysis.
Deep learning-based methods have achieved significant performance for image defogging. However, existing methods are mainly developed for land scenes and perform poorly when dealing with overwater foggy images, since overwater scenes typically contain large expanses of sky and water. In this work, we propose a Prior map Guided CycleGAN (PG-CycleGAN) for defogging of images with overwater scenes. To promote the recovery of the objects on water in the image, two loss functions are exploited for the network where a prior map is designed to invert the dark channel and the min-max normalization is used to suppress the sky and emphasize objects. However, due to the unpaired training set, the network may learn an under-constrained domain mapping from foggy to fog-free image, leading to artifacts and loss of details. Thus, we propose an intuitive Upscaling Inception Module (UIM) and a Long-range Residual Coarse-to-fine framework (LRC) to mitigate this issue. Extensive experiments on qualitative and quantitative comparisons demonstrate that the proposed method outperforms the state-of-the-art supervised, semi-supervised, and unsupervised defogging approaches.
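One plausible reading of the prior map described above is sketched here: compute the dark channel (the minimum over color channels within a local patch), invert it so that bright regions such as sky and fog score low while dark objects score high, and rescale the result to [0, 1] with min-max normalization. How the authors actually combine these steps and feed the map into the loss functions is not stated in the abstract; the patch `radius`, the [0, 1] pixel range, and the function names are assumptions.

```python
def dark_channel(image, radius=1):
    """Dark channel of an RGB image given as rows of (r, g, b) tuples in [0, 1].

    For each pixel, take the minimum channel value over a square patch of
    the given radius (clipped at the image border).
    """
    height, width = len(image), len(image[0])
    per_pixel_min = [[min(pixel) for pixel in row] for row in image]
    dark = []
    for i in range(height):
        row = []
        for j in range(width):
            row.append(min(
                per_pixel_min[y][x]
                for y in range(max(0, i - radius), min(height, i + radius + 1))
                for x in range(max(0, j - radius), min(width, j + radius + 1))
            ))
        dark.append(row)
    return dark


def prior_map(image, radius=1):
    """Inverted dark channel, min-max normalized to [0, 1]."""
    inverted = [[1.0 - d for d in row] for row in dark_channel(image, radius)]
    low = min(min(row) for row in inverted)
    high = max(max(row) for row in inverted)
    if high == low:  # flat image: no contrast to normalize
        return [[0.0 for _ in row] for row in inverted]
    return [[(v - low) / (high - low) for v in row] for row in inverted]


if __name__ == "__main__":
    # Hypothetical 1x2 image: a bright sky pixel next to a dark object pixel.
    image = [[(0.9, 0.9, 0.95), (0.1, 0.2, 0.1)]]
    print(prior_map(image, radius=0))
```

In this toy example the sky pixel maps to 0 and the object pixel to 1, which matches the stated goal of suppressing the sky and emphasizing objects.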
Code generation models have achieved impressive performance. However, they tend to be brittle as slight edits to a prompt could lead to very different generations; these robustness properties, critical for user experience when deployed in real-life applications, are not well understood. Most existing works on robustness in text or code tasks have focused on classification, while robustness in generation tasks is an uncharted area and to date there is no comprehensive benchmark for robustness in code generation. In this paper, we propose ReCode, a comprehensive robustness evaluation benchmark for code generation models. We customize over 30 transformations specifically for code on docstrings, function and variable names, code syntax, and code format. They are carefully designed to be natural in real-life coding practice, preserve the original semantic meaning, and thus provide multifaceted assessments of a model's robustness performance. With human annotators, we verified that over 90% of the perturbed prompts do not alter the semantic meaning of the original prompt. In addition, we define robustness metrics for code generation models considering the worst-case behavior under each type of perturbation, taking advantage of the fact that executing the generated code can serve as objective evaluation. We demonstrate ReCode on SOTA models using HumanEval, MBPP, as well as function completion tasks derived from them. Interesting observations include: better robustness for CodeGen over InCoder and GPT-J; models are most sensitive to syntax perturbations; more challenging robustness evaluation on MBPP over HumanEval.
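A worst-case robustness score of the kind described above can be sketched as follows: a problem counts as solved only if the generated code passes its tests for every perturbed variant of the prompt, and the relative drop from the unperturbed pass rate summarizes how brittle the model is. This is a simplified illustration with one generation per prompt, not the benchmark's exact metric definitions; the data layout and function names are assumptions.

```python
def pass_rate(nominal_outcomes):
    """Fraction of problems solved on the original, unperturbed prompts.

    nominal_outcomes maps a problem id to True if the generated code
    passed that problem's unit tests.
    """
    return sum(nominal_outcomes.values()) / len(nominal_outcomes)


def worst_case_pass_rate(perturbed_outcomes):
    """Fraction of problems solved under every perturbed prompt variant.

    perturbed_outcomes maps a problem id to a list of booleans, one per
    perturbed prompt; a single failure marks the problem as unsolved.
    """
    solved = [all(runs) for runs in perturbed_outcomes.values()]
    return sum(solved) / len(solved)


def relative_drop(nominal_outcomes, perturbed_outcomes):
    """Relative loss in pass rate when moving to the worst case."""
    nominal = pass_rate(nominal_outcomes)
    if nominal == 0.0:
        return 0.0
    return (nominal - worst_case_pass_rate(perturbed_outcomes)) / nominal


if __name__ == "__main__":
    # Hypothetical execution results for four problems.
    nominal = {"p1": True, "p2": True, "p3": False, "p4": True}
    perturbed = {
        "p1": [True, True],
        "p2": [True, False],
        "p3": [False, False],
        "p4": [True, True],
    }
    print(pass_rate(nominal), worst_case_pass_rate(perturbed))
    print(relative_drop(nominal, perturbed))
```

Executing the generated code against unit tests is what makes each boolean outcome objective, so no human judgment is needed to score a perturbation.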